Web Scale Competitor Discovery Using Mutual Information
Authors
Abstract
With its rapid expansion, the web has become an excellent resource for gathering information and people’s opinions. A company owner wants to know who the competitors are, and a customer wants to know which companies provide products or services similar to the one he or she is looking for. This paper proposes an approach based on mutual information that mines the competitors of an entity (such as a company, product, or person) from the web. The proposed techniques first extract a set of candidate competitors for the input entity, then rank them by comparability, and finally find and organize the reviews related to both the original entity and its competitors. A novel system called "CoDis", built on these techniques, automates this tedious process in a domain-independent, web-scale, and dynamic manner. In the experiment, 32 entities from varied domains are used as inputs, and CoDis discovers 143 competitors. The experimental results show that the proposed techniques are highly effective.
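The ranking step can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so this assumes pointwise mutual information (PMI) computed from co-occurrence counts such as search-engine hit counts. The function names, the input layout, and all counts below are hypothetical and are not taken from CoDis.

```python
import math


def pmi(joint_hits, entity_hits, candidate_hits, total_docs):
    """Pointwise mutual information, log2(p(x, y) / (p(x) * p(y))), from raw counts.

    Assumption: probabilities are estimated as count / total_docs.
    """
    if min(joint_hits, entity_hits, candidate_hits) <= 0:
        # No co-occurrence evidence: treat the pair as unrelated.
        return float("-inf")
    return math.log2((joint_hits * total_docs) / (entity_hits * candidate_hits))


def rank_candidates(entity_hits, candidates, total_docs):
    """Sort candidates by PMI with the input entity, highest first.

    candidates maps a candidate name to (candidate_hits, joint_hits).
    """
    scored = [
        (name, pmi(joint, entity_hits, own, total_docs))
        for name, (own, joint) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    example = {
        "candidate_a": (2_000, 400),
        "candidate_b": (50_000, 500),
        "candidate_c": (1_000, 0),
    }
    for name, score in rank_candidates(10_000, example, 1_000_000):
        print(name, score)
```

PMI rewards candidates that co-occur with the entity more often than their individual frequencies would predict, so a very common term with many joint hits can still rank below a rarer but more specific candidate.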
Similar resources
Automatic Discovery of Technology Networks for Industrial-Scale R&D IT Projects via Data Mining
Industrial-scale R&D IT projects depend on many sub-technologies, which need to be understood and have their risks analysed before the project can begin if it is to succeed. When planning such an industrial-scale project, the list of technologies and their associations with each other are often complex and form a network. Discovery of this network of technologies is time consumi...
Exploring Relevance as Truth Criterion on the Web and Classifying Claims in Belief Levels
The Web has become the most important information source for most of us. Unfortunately, there is no guarantee for the correctness of information on the Web. Moreover, different websites often provide conflicting information on a subject. Several truth discovery methods have been proposed for various scenarios, and they have been successfully applied in diverse application domains. In this paper...
Focused Crawling for Retrieving Chemical Information
The exponential growth of resources available on the Web has made it important to develop instruments for searching efficiently. This paper proposes an approach to chemical information discovery using focused crawling. Combinations of various feature representations and classifier algorithms for implementing focused crawlers were compared. Latent Semantic Indexing (LSI...
WSDL Retrieval for Web Services Based on Hybrid SLVM
Recently, two operable WSDL retrieval approaches, bipartite-graph matching and KbSM, were developed for Web service discovery. But their models and similarity metrics of WSDL ignore some term or semantic feature, and involve formal method problem of representation or difficulty of parameter verification. SLVM approaches depend on statistical term measures to implement XML document representatio...
Expert Discovery: A web mining approach
Expert discovery is the search for an answer to the question: “Who is the best expert on a specific subject in a particular domain, within a particular set of parameters?” Experts with domain knowledge in any field are crucial for consulting in industry, academia, and the scientific community. The aim of this study is to address the issues of the expert-finding task in real-world communities. Collabor...